- Common Lisp the Language, 2nd Edition
- -------------------------------------------------------------------------------
-
- 13. Characters
-
- Common Lisp provides a character data type; objects of this type represent
- printed symbols such as letters.
-
- In general, characters in Common Lisp are not true objects; eq cannot be
- counted upon to operate on them reliably. In particular, it is possible that
- the expression
-
- (let ((x z) (y z)) (eq x y))
-
- may be false rather than true, if the value of z is a character.
-
- -------------------------------------------------------------------------------
- Rationale: This odd breakdown of eq in the case of characters allows the
- implementor enough design freedom to produce exceptionally efficient code on
- conventional architectures. In this respect the treatment of characters exactly
- parallels that of numbers, as described in chapter 12.
- -------------------------------------------------------------------------------
-
-
-
- ------------------------------------------------------------------------------
- Table 13-1: Standard Character Labels, Glyphs, and Descriptions
-
-                              SM05 @ commercial at         SD13 ` grave accent
- SP02 ! exclamation mark      LA02 A capital A             LA01 a small a
- SP04 " quotation mark        LB02 B capital B             LB01 b small b
- SM01 # number sign           LC02 C capital C             LC01 c small c
- SC03 $ dollar sign           LD02 D capital D             LD01 d small d
- SM02 % percent sign          LE02 E capital E             LE01 e small e
- SM03 & ampersand             LF02 F capital F             LF01 f small f
- SP05 ' apostrophe            LG02 G capital G             LG01 g small g
- SP06 ( left parenthesis      LH02 H capital H             LH01 h small h
- SP07 ) right parenthesis     LI02 I capital I             LI01 i small i
- SM04 * asterisk              LJ02 J capital J             LJ01 j small j
- SA01 + plus sign             LK02 K capital K             LK01 k small k
- SP08 , comma                 LL02 L capital L             LL01 l small l
- SP10 - hyphen or minus sign  LM02 M capital M             LM01 m small m
- SP11 . period or full stop   LN02 N capital N             LN01 n small n
- SP12 / solidus               LO02 O capital O             LO01 o small o
- ND10 0 digit 0               LP02 P capital P             LP01 p small p
- ND01 1 digit 1               LQ02 Q capital Q             LQ01 q small q
- ND02 2 digit 2               LR02 R capital R             LR01 r small r
- ND03 3 digit 3               LS02 S capital S             LS01 s small s
- ND04 4 digit 4               LT02 T capital T             LT01 t small t
- ND05 5 digit 5               LU02 U capital U             LU01 u small u
- ND06 6 digit 6               LV02 V capital V             LV01 v small v
- ND07 7 digit 7               LW02 W capital W             LW01 w small w
- ND08 8 digit 8               LX02 X capital X             LX01 x small x
- ND09 9 digit 9               LY02 Y capital Y             LY01 y small y
- SP13 : colon                 LZ02 Z capital Z             LZ01 z small z
- SP14 ; semicolon             SM06 [ left square bracket   SM11 { left curly bracket
- SA03 < less-than sign        SM07 \ reverse solidus       SM13 | vertical bar
- SA04 = equals sign           SM08 ] right square bracket  SM14 } right curly bracket
- SA05 > greater-than sign     SD15 ^ circumflex accent     SD19 ~ tilde
- SP15 ? question mark         SP09 _ low line
- ------------------------------------------------------------------------------
-
- If two objects are to be compared for ``identity,'' but either might be a
- character, then the predicate eql is probably appropriate.
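-
- As a minimal sketch of this advice, the comparison in the let expression
- shown earlier can be written with eql instead of eq. The function name
- same-object-p is hypothetical and is used only for illustration.

```lisp
;; Hypothetical helper (not part of Common Lisp): compare two objects for
;; "identity" in a way that stays reliable when either is a character.
(defun same-object-p (x y)
  (eql x y))

(let* ((z #\a) (x z) (y z))
  (same-object-p x y))        ; true in every implementation

(same-object-p #\a #\A)       ; false: these are different characters

;; By contrast, (eq #\a #\a) may be true or false, depending on how the
;; implementation represents characters.
```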
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to approve the following
- definitions and terminology for use in discussing character facilities in
- Common Lisp.
-
- A character repertoire defines a collection of characters independent of their
- specific rendered image or font. (This corresponds to the mathematical notion
- of a set, but the term character set is avoided here because it has been used
- in the past to mean both what is here called a repertoire and what is here
- called a coded character set.) Character repertoires are specified independent
- of coding and their characters are identified only with a unique character
- label, a graphic symbol, and a character description. As an example, table 13-1
- shows the character labels, graphic symbols, and character descriptions for all
- of the characters in the repertoire standard-char except for #\Space and
- #\Newline.
-
- Every Common Lisp implementation must support the standard character repertoire
- as well as repertoires named base-character, extended-character, and character.
- Other repertoires may be supported as well. X3J13 voted in June 1989
- (MORE-CHARACTER-PROPOSAL) to specify that names of repertoires may be used as
- type specifiers. Such types must be subtypes of character; that is, in a given
- implementation the repertoire named character must encompass all the character
- objects supported by that implementation.
-
- A coded character set is a character repertoire plus an encoding that provides
- a bijective mapping between each character in the set and a number (typically a
- non-negative integer) that serves as the character representation. There are
- numerous internationally standardized coded character sets.
-
- A character may be included in one or more character repertoires. Similarly, a
- character may be included in one or more coded character sets.
-
- To ensure that each character is uniquely defined, we may use a universal
- registry of characters that incorporates a collection of distinguished
- repertoires called character scripts that form an exhaustive partition of all
- characters. That is, each character is included in exactly one character
- script. (Draft ISO 10646 Coded Character Set Standard, if eventually approved
- as a standard, may become the practical realization of this universal
- registry.)
-
- (X3J13 voted in June 1989 (MORE-CHARACTER-PROPOSAL) to specify that an
- implementation must document the character scripts it supports. For each script
- the documentation should discuss character labels, glyphs, and descriptions;
- any canonicalization processes performed by the reader that result in treating
- distinct characters as equivalent; any canonicalization performed by format in
- processing directives; the behavior of char-upcase, char-downcase, and the
- predicates alpha-char-p, upper-case-p, lower-case-p, both-case-p,
- graphic-char-p, alphanumericp, char-equal, char-not-equal, char-lessp,
- char-greaterp, char-not-greaterp, and char-not-lessp for characters in the
- script; and behavior with respect to input and output, including coded
- character sets and external coding schemes.)
-
- In Common Lisp a character data object is identified by its character code, a
- unique numerical code. Each character code is composed from a character script
- and a character label. The convention by which a character script and character
- label compose a character code is implementation dependent. [X3J13 did not
- approve all parts of the proposal from its Subcommittee on Characters. As a
- result, some features that were approved appear to have no purpose. X3J13
- wished to support the standardization by ISO of character scripts and coded
- character sets but declined to design facilities for use in Common Lisp until
- there has been more progress by ISO in this area. The approval of the
- terminology for scripts and labels gives a hint to implementors of likely
- directions for Common Lisp in the future.]
-
- A character object that is classified as graphic, or displayable, has an
- associated glyph. The glyph is the visual representation of the character. All
- other character data objects are classified as non-graphic.
-
- This terminology assigns names to Common Lisp concepts in a manner consistent
- with related concepts discussed in various ISO standards for coded character
- sets and provides a demarcation between standardization activities. For
- example, facilities for manipulating characters, character scripts, and coded
- character sets are properly defined by a Common Lisp standard, but Common Lisp
- should not define standard character sets or standard character scripts.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- * Character Attributes
- * Predicates on Characters
- * Character Construction and Selection
- * Character Conversions
- * Character Control-Bit Functions
-
- -------------------------------------------------------------------------------
-
- 13.1. Character Attributes
-
- Every character has three attributes: code, bits, and font. The code attribute
- is intended to distinguish among the printed glyphs and formatting functions
- for characters. The bits attribute allows extra flags to be associated with a
- character. The font attribute permits a specification of the style of the
- glyphs (such as italics).
-
- [change_begin]
- The treatment of character attributes in Common Lisp has not been entirely
- successful. The font attribute has not been widely used, for two reasons.
- First, a single integer, limited in most implementations to 255 at most, is not
- an adequate, convenient, or portable representation for a font. Second, in many
- applications where font information matters it is more convenient or more
- efficient to represent font information as shift codes that apply to many
- characters, rather than attaching font information separately to each
- character.
-
- As for the bits attribute, it was intended to support character input from
- extended keyboards having extra ``shift'' keys. This, in turn, was imagined to
- support the programming of a portable EMACS-like editor in Common Lisp. (The
- EMACS command set is most convenient when the keyboard has separate ``control''
- and ``meta'' keys.) The bits attribute has been used in the implementation of
- such editors and other interactive interfaces. However, software that relies
- crucially on these extended characters will not be portable to Common Lisp
- implementations that do not support them.
-
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) and in June 1989
- (MORE-CHARACTER-PROPOSAL) to revise considerably the treatment of characters
- in the language. The bits and font attributes are eliminated; instead a
- character may have implementation-defined attributes. The treatment of such
- attributes by existing character-handling functions is carefully constrained by
- certain rules.
-
- Implementations are free to continue to support bits and font attributes, but
- they are formally regarded as implementation-defined attributes. The rules are
- generally consistent with the previous treatment of the bits and font
- attributes. My guess is that the font attribute as currently defined will
- wither away, but the bits attribute as defined by the first edition will
- continue to be supported as a de facto standard extension, because it fills a
- useful small purpose.
- [change_end]
-
- [Constant]
- char-code-limit
-
- The value of char-code-limit is a non-negative integer that is the upper
- exclusive bound on values produced by the function char-code, which returns the
- code component of a given character; that is, the values returned by char-code
- are non-negative and strictly less than the value of char-code-limit.
-
- [change_begin]
- Common Lisp does not at present explicitly guarantee that all integers between
- zero and the value of char-code-limit are valid character codes, and so it is
- wise in any case for the programmer to assume that the space of assigned
- character codes may be sparse.
- [change_end]
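-
- One way to program defensively against a sparse code space is to test the
- result of code-char, which returns nil when no character exists for a given
- code. The following is only a sketch; the function name
- characters-in-code-range is hypothetical.

```lisp
;; Hypothetical helper: collect the characters that actually exist for the
;; codes START (inclusive) up to END (exclusive), skipping unassigned codes.
(defun characters-in-code-range (start end)
  (let ((result '()))
    (do ((code start (+ code 1)))
        ((>= code (min end char-code-limit)) (nreverse result))
      (let ((ch (code-char code)))
        ;; code-char yields nil when no character has this code.
        (when ch (push ch result))))))

;; The length of the result may be smaller than the width of the range.
(length (characters-in-code-range 0 128))
```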
-
- [old_change_begin]
-
- [Constant]
- char-font-limit
-
- The value of char-font-limit is a non-negative integer that is the upper
- exclusive bound on values produced by the function char-font, which returns the
- font component of a given character; that is, the values returned by char-font
- are non-negative and strictly less than the value of char-font-limit.
-
- -------------------------------------------------------------------------------
- Implementation note: No Common Lisp implementation is required to support
- non-zero font attributes; if it does not, then char-font-limit should be 1.
- -------------------------------------------------------------------------------
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate char-font-limit.
-
- Experience has shown that numeric codes are not an especially convenient, let
- alone portable, representation for font information. A system based on typeface
- names, type styles, and point sizes would be much better. (Macintosh software
- developers made the same discovery and have recently converted to a new font
- identification scheme.)
- [change_end]
-
- [old_change_begin]
-
- [Constant]
- char-bits-limit
-
- The value of char-bits-limit is a non-negative integer that is the upper
- exclusive bound on values produced by the function char-bits, which returns the
- bits component of a given character; that is, the values returned by char-bits
- are non-negative and strictly less than the value of char-bits-limit. Note that
- the value of char-bits-limit will be a power of 2.
-
- -------------------------------------------------------------------------------
- Implementation note: No Common Lisp implementation is required to support
- non-zero bits attributes; if it does not, then char-bits-limit should be 1.
- -------------------------------------------------------------------------------
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate char-bits-limit.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- 13.2. Predicates on Characters
-
- The predicate characterp may be used to determine whether any Lisp object is a
- character object.
-
- [Function]
- standard-char-p char
-
- The argument char must be a character object. standard-char-p is true if the
- argument is a ``standard character,'' that is, an object of type standard-char.
-
- Note that any character with a non-zero bits or font attribute is non-standard.
-
- [Function]
- graphic-char-p char
-
- The argument char must be a character object. graphic-char-p is true if the
- argument is a ``graphic'' (printing) character, and false if it is a
- ``non-graphic'' (formatting or control) character. Graphic characters have a
- standard textual representation as a single glyph, such as A or * or =. By
- convention, the space character is considered to be graphic. Of the standard
- characters all but #\Newline are graphic. The semi-standard characters
- #\Backspace, #\Tab, #\Rubout, #\Linefeed, #\Return, and #\Page are not graphic.
-
- Programs may assume that graphic characters of font 0 are all of the same width
- when printed, for example, for purposes of columnar formatting. (This does not
- prohibit the use of a variable-pitch font as font 0, but merely implies that
- every implementation of Common Lisp must provide some mode of operation in
- which font 0 is a fixed-pitch font.) Portable programs should assume that, in
- general, non-graphic characters and characters of other fonts may be of varying
- widths.
-
- Any character with a non-zero bits attribute is non-graphic.
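-
- A short illustration of these rules for standard characters follows. The
- helper count-graphic-chars is a hypothetical name; under the fixed-width
- assumption described above, such a count could serve as a rough estimate of
- the number of columns a string occupies.

```lisp
(graphic-char-p #\A)          ; true
(graphic-char-p #\Space)      ; true: the space character counts as graphic
(graphic-char-p #\Newline)    ; false

;; Hypothetical helper: count the graphic characters in a string.
(defun count-graphic-chars (string)
  (count-if #'graphic-char-p string))
```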
-
- [old_change_begin]
- [Function]
- string-char-p char
-
- The argument char must be a character object. string-char-p is true if char can
- be stored into a string, and otherwise is false. Any character that satisfies
- standard-char-p also satisfies string-char-p; others may also.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate string-char-p.
- [change_end]
-
- [Function]
- alpha-char-p char
-
- The argument char must be a character object. alpha-char-p is true if the
- argument is an alphabetic character, and otherwise is false.
-
- If a character is alphabetic, then it is perforce graphic. Therefore any
- character with a non-zero bits attribute cannot be alphabetic. Whether a
- character is alphabetic may depend on its font number.
-
- Of the standard characters (as defined by standard-char-p), the letters A
- through Z and a through z are alphabetic.
-
- [Function]
- upper-case-p char
- lower-case-p char
- both-case-p char
-
- The argument char must be a character object.
-
- upper-case-p is true if the argument is an uppercase character, and otherwise
- is false.
-
- lower-case-p is true if the argument is a lowercase character, and otherwise is
- false.
-
- both-case-p is true if the argument is an uppercase character and there is a
- corresponding lowercase character (which can be obtained using char-downcase),
- or if the argument is a lowercase character and there is a corresponding
- uppercase character (which can be obtained using char-upcase).
-
- If a character is either uppercase or lowercase, it is necessarily alphabetic
- (and therefore is graphic, and therefore has a zero bits attribute). However,
- it is permissible in theory for an alphabetic character to be neither uppercase
- nor lowercase (in a non-Roman font, for example).
-
- Of the standard characters (as defined by standard-char-p), the letters A
- through Z are uppercase and a through z are lowercase.
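-
- The following sketch classifies a character using these predicates. The
- function name case-of and its keyword results are assumptions made for this
- example only.

```lisp
;; Hypothetical helper: return :upper, :lower, or nil for a character.
(defun case-of (char)
  (cond ((upper-case-p char) :upper)
        ((lower-case-p char) :lower)
        (t nil)))               ; digits, punctuation, caseless letters

(case-of #\A)        ; => :upper
(case-of #\z)        ; => :lower
(case-of #\5)        ; => nil
(both-case-p #\A)    ; true: #\a is the corresponding lowercase character
(both-case-p #\5)    ; false
```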
-
- [Function]
- digit-char-p char &optional (radix 10)
-
- The argument char must be a character object, and radix must be a non-negative
- integer. If char is not a digit of the radix specified by radix, then
- digit-char-p is false; otherwise it returns a non-negative integer that is the
- ``weight'' of char in that radix.
-
- Digits are necessarily graphic characters.
-
- Of the standard characters (as defined by standard-char-p), the characters 0
- through 9, A through Z, and a through z are digits. The weights of 0 through 9
- are the integers 0 through 9, and of A through Z (and also a through z) are 10
- through 35. digit-char-p returns the weight for one of these digits if and only
- if its weight is strictly less than radix. Thus, for example, the digits for
- radix 16 are
-
- 0 1 2 3 4 5 6 7 8 9 A B C D E F
-
- Here is an example of the use of digit-char-p:
-
- (defun convert-string-to-integer (str &optional (radix 10))
-   "Given a digit string and optional radix, return an integer."
-   (do ((j 0 (+ j 1))
-        (n 0 (+ (* n radix)
-                (or (digit-char-p (char str j) radix)
-                    (error "Bad radix-~D digit: ~C"
-                           radix
-                           (char str j))))))
-       ((= j (length str)) n)))
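-
- The weights described above can also be checked directly on individual
- standard characters; the calls below use only digit-char-p itself.

```lisp
(digit-char-p #\7)        ; => 7
(digit-char-p #\a 16)     ; => 10
(digit-char-p #\F 16)     ; => 15
(digit-char-p #\g 16)     ; => nil, because weight 16 is not less than 16
(digit-char-p #\8 8)      ; => nil, because weight 8 is not less than 8
(digit-char-p #\Z 36)     ; => 35
(digit-char-p #\a)        ; => nil in the default radix 10
```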
-
- [Function]
- alphanumericp char
-
- The argument char must be a character object. alphanumericp is true if char is
- either alphabetic or numeric. By definition,
-
- (alphanumericp x)
- == (or (alpha-char-p x) (not (null (digit-char-p x))))
-
- Alphanumeric characters are therefore necessarily graphic (as defined by the
- predicate graphic-char-p).
-
- Of the standard characters (as defined by standard-char-p), the characters 0
- through 9, A through Z, and a through z are alphanumeric.
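-
- The defining equivalence can be spot-checked over the characters of a
- string, as in the sketch below. The function name alphanumericp-agrees-p is
- hypothetical.

```lisp
;; Hypothetical helper: true if, for every character X in STRING,
;; (alphanumericp x) agrees with
;; (or (alpha-char-p x) (not (null (digit-char-p x)))).
(defun alphanumericp-agrees-p (string)
  (every #'(lambda (x)
             (eq (not (null (alphanumericp x)))
                 (not (null (or (alpha-char-p x)
                                (digit-char-p x))))))
         string))

(alphanumericp-agrees-p "Az09 *-_")   ; true
```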
-
- [Function]
- char= character &rest more-characters
- char/= character &rest more-characters
- char< character &rest more-characters
- char> character &rest more-characters
- char<= character &rest more-characters
- char>= character &rest more-characters
-
- The arguments must all be character objects. These functions compare the
- objects using the implementation-dependent total ordering on characters, in a
- manner analogous to numeric comparisons by = and related functions.
-
- The total ordering on characters is guaranteed to have the following
- properties:
-
- * The standard alphanumeric characters obey the following partial ordering:
-
- A<B<C<D<E<F<G<H<I<J<K<L<M<N<O<P<Q<R<S<T<U<V<W<X<Y<Z
- a<b<c<d<e<f<g<h<i<j<k<l<m<n<o<p<q<r<s<t<u<v<w<x<y<z
- 0<1<2<3<4<5<6<7<8<9
- either 9<A or Z<0
- either 9<a or z<0
-
- This implies that alphabetic ordering holds within each case (upper and
- lower), and that the digits as a group are not interleaved with letters.
- However, the ordering or possible interleaving of uppercase letters and
- lowercase letters is unspecified. (Note that both the ASCII and the EBCDIC
- character sets conform to this specification. As it happens, neither
- ordering interleaves uppercase and lowercase letters: in the ASCII
- ordering, 9<A and Z<a, whereas in the EBCDIC ordering z<A and Z<0.)
-
- [old_change_begin]
-
- * If two characters have the same bits and font attributes, then their
- ordering by char< is consistent with the numerical ordering by the
- predicate < on their code attributes.
-
- * If two characters differ in any attribute (code, bits, or font), then
- they are different.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits
- and font attributes with that of implementation-defined attributes.
-
- * If two characters have identical implementation-defined attributes, then
- their ordering by char< is consistent with the numerical ordering by the
- predicate < on their codes, and similarly for char>, char<=, and char>=.
-
- * If two characters differ in any implementation-defined attribute, then
- they are not char=.
-
- [change_end]
-
- The total ordering is not necessarily the same as the total ordering on the
- integers produced by applying char-int to the characters (although it is a
- reasonable implementation technique to use that ordering).
-
- While alphabetic characters of a given case must be properly ordered, they need
- not be contiguous; thus (char<= #\a x #\z) is not a valid way of determining
- whether or not x is a lowercase letter. That is why a separate lower-case-p
- predicate is provided.
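-
- A portable test therefore calls the predicate rather than comparing against
- the endpoints of a range. The helper count-lowercase below is a hypothetical
- name used only to illustrate the point.

```lisp
;; Not portable: assumes no other character falls between #\a and #\z.
;;   (char<= #\a x #\z)
;; Portable: ask the predicate directly.
;;   (lower-case-p x)

;; Hypothetical helper: count the lowercase letters in a string.
(defun count-lowercase (string)
  (count-if #'lower-case-p string))

(count-lowercase "aB1c")    ; => 2
```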
-
- (char= #\d #\d) is true.
- (char/= #\d #\d) is false.
- (char= #\d #\x) is false.
- (char/= #\d #\x) is true.
- (char= #\d #\D) is false.
- (char/= #\d #\D) is true.
- (char= #\d #\d #\d #\d) is true.
- (char/= #\d #\d #\d #\d) is false.
- (char= #\d #\d #\x #\d) is false.
- (char/= #\d #\d #\x #\d) is false.
- (char= #\d #\y #\x #\c) is false.
- (char/= #\d #\y #\x #\c) is true.
- (char= #\d #\c #\d) is false.
- (char/= #\d #\c #\d) is false.
- (char< #\d #\x) is true.
- (char<= #\d #\x) is true.
- (char< #\d #\d) is false.
- (char<= #\d #\d) is true.
- (char< #\a #\e #\y #\z) is true.
- (char<= #\a #\e #\y #\z) is true.
- (char< #\a #\e #\e #\y) is false.
- (char<= #\a #\e #\e #\y) is true.
- (char> #\e #\d) is true.
- (char>= #\e #\d) is true.
- (char> #\d #\c #\b #\a) is true.
- (char>= #\d #\c #\b #\a) is true.
- (char> #\d #\d #\c #\a) is false.
- (char>= #\d #\d #\c #\a) is true.
- (char> #\e #\d #\b #\c #\a) is false.
- (char>= #\e #\d #\b #\c #\a) is false.
- (char> #\z #\A) may be true or false.
- (char> #\Z #\a) may be true or false.
-
- There is no requirement that (eq c1 c2) be true merely because (char= c1 c2) is
- true. While eq may distinguish two character objects that char= does not, it is
- distinguishing them not as characters, but in some sense on the basis of a
- lower-level implementation characteristic. (Of course, if (eq c1 c2) is true,
- then one may expect (char= c1 c2) to be true.) However, eql and equal compare
- character objects in the same way that char= does.
-
- [Function]
- char-equal character &rest more-characters
- char-not-equal character &rest more-characters
- char-lessp character &rest more-characters
- char-greaterp character &rest more-characters
- char-not-greaterp character &rest more-characters
- char-not-lessp character &rest more-characters
-
- [old_change_begin]
- The predicate char-equal is like char=, and similarly for the others, except
- according to a different ordering such that differences of bits attributes and
- case are ignored, and font information is taken into account in an
- implementation-dependent manner.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits
- and font attributes with that of implementation-defined attributes. The effect,
- if any, of each such attribute on the behavior of char-equal, char-not-equal,
- char-lessp, char-greaterp, char-not-greaterp, and char-not-lessp must be
- specified as part of the definition of that attribute.
- [change_end]
-
- For the standard characters, the ordering is such that A=a, B=b, and so on, up
- to Z=z, and furthermore either 9<A or Z<0. For example:
-
- (char-equal #\A #\a) is true.
- (char= #\A #\a) is false.
- (char-equal #\A #\Control-A) is true.
-
- [old_change_begin]
- The ordering may depend on the font information. For example, an implementation
- might decree that (char-equal #\p #\p) be true, but that (char-equal #\p #\pi)
- be false (where #\pi is a lowercase p in some font). Assuming italics to be in
- font 1 and the Greek alphabet in font 2, this is the same as saying that
- (char-equal #0\p #1\p) may be true and at the same time (char-equal #0\p #2\p)
- may be false.
- [old_change_end]
-
- -------------------------------------------------------------------------------
-
- 13.3. Character Construction and Selection
-
- These functions may be used to extract attributes of a character and to
- construct new characters.
-
- [Function]
- char-code char
-
- The argument char must be a character object. char-code returns the code
- attribute of the character object; this will be a non-negative integer less
- than the (normal) value of the variable char-code-limit.
-
- [change_begin]
- This is usually what you need in order to treat a character as an index into a
- vector. The length of the vector should then be equal to char-code-limit. Be
- careful how you initialize this vector; remember that you cannot necessarily
- expect all non-negative integers less than char-code-limit to be valid
- character codes.
- [change_end]
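-
- As an illustration of indexing a vector by character code, the following
- sketch tallies how often each character occurs in a string. The function
- name char-histogram is hypothetical. Initializing every element to zero
- avoids having to decide which indices correspond to valid character codes.

```lisp
;; Hypothetical helper: return a vector of length char-code-limit whose
;; element at index (char-code c) is the number of occurrences of c in STRING.
(defun char-histogram (string)
  (let ((counts (make-array char-code-limit :initial-element 0)))
    (dotimes (i (length string) counts)
      (incf (aref counts (char-code (char string i)))))))

(aref (char-histogram "banana") (char-code #\a))    ; => 3
```

- Note that char-code-limit may be large in some implementations, so a hash
- table keyed by character may be a more economical choice there.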
-
- [old_change_begin]
-
- [Function]
- char-bits char
-
- The argument char must be a character object. char-bits returns the bits
- attribute of the character object; this will be a non-negative integer less
- than the (normal) value of the variable char-bits-limit.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate char-bits.
- [change_end]
-
- [old_change_begin]
-
- [Function]
- char-font char
-
- The argument char must be a character object. char-font returns the font
- attribute of the character object; this will be a non-negative integer less
- than the (normal) value of the variable char-font-limit.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate char-font.
-
- The references to the ``normal'' values of the ``variables'' char-code-limit,
- char-bits-limit, and char-font-limit in the descriptions of char-code,
- char-bits, and char-font were an oversight on my part. Early in the design of
- Common Lisp they were indeed variables, but they are at present defined to be
- constants, and their values therefore are always normal and should not change.
- But this point is now moot.
- [change_end]
-
- [Function]
- code-char code &optional (bits 0) (font 0)
-
- [old_change_begin]
- All three arguments must be non-negative integers. If it is possible in the
- implementation to construct a character object whose code attribute is code,
- whose bits attribute is bits, and whose font attribute is font, then such an
- object is returned; otherwise nil is returned.
-
- For any integers c, b, and f, if (code-char c b f) is not nil then
-
- (char-code (code-char c b f)) => c
- (char-bits (code-char c b f)) => b
- (char-font (code-char c b f)) => f
-
- If the font and bits attributes of a character object c are zero, then it is
- the case that
-
- (char= (code-char (char-code c)) c)
-
- is true.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the bits and font
- arguments from the specification of code-char.
- [change_end]
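-
- With the single-argument form, the round trip through char-code can be
- sketched as follows. The function name code-round-trips-p is hypothetical,
- and the example assumes characters with no implementation-defined
- attributes, such as the standard characters.

```lisp
;; Hypothetical helper: true if converting C to its code and back again
;; yields a character that is char= to C.
(defun code-round-trips-p (c)
  (let ((c2 (code-char (char-code c))))
    ;; code-char may return nil, so test for that before comparing.
    (and c2 (char= c2 c))))

(every #'code-round-trips-p "Hello, world")    ; true
```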
-
- [old_change_begin]
-
- [Function]
- make-char char &optional (bits 0) (font 0)
-
- The argument char must be a character, and bits and font must be non-negative
- integers. If it is possible in the implementation to construct a character
- object whose code attribute is the same as the code attribute of char, whose
- bits attribute is bits, and whose font attribute is font, then such an object
- is returned; otherwise nil is returned.
-
- If bits and font are zero, then make-char cannot fail. This implies that for
- every character object one can ``turn off'' its bits and font attributes.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate make-char.
- [change_end]
-
- -------------------------------------------------------------------------------
-
- 13.4. Character Conversions
-
- These functions perform various transformations on characters, including case
- conversions.
-
- [Function]
- character object
-
- The function character coerces its argument to be a character if possible; see
- coerce.
-
- (character x) == (coerce x 'character)
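-
- For instance, a string of length 1 can be coerced in this way, and a
- character is returned unchanged:

```lisp
(character "a")     ; => #\a
(character #\a)     ; => #\a
```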
-
- [Function]
- char-upcase char
- char-downcase char
-
- The argument char must be a character object. char-upcase attempts to convert
- its argument to an uppercase equivalent; char-downcase attempts to convert its
- argument to a lowercase equivalent.
-
- [old_change_begin]
- char-upcase returns a character object with the same font and bits attributes
- as char, but with possibly a different code attribute. If the code is different
- from char's, then the predicate lower-case-p is true of char, and upper-case-p
- is true of the result character. Moreover, if (char= (char-upcase x) x) is not
- true, then it is true that
-
- (char= (char-downcase (char-upcase x)) x)
-
- Similarly, char-downcase returns a character object with the same font and bits
- attributes as char, but with possibly a different code attribute. If the code
- is different from char's, then the predicate upper-case-p is true of char, and
- lower-case-p is true of the result character. Moreover, if (char=
- (char-downcase x) x) is not true, then it is true that
-
- (char= (char-upcase (char-downcase x)) x)
-
- [old_change_end]
-
- Note that the action of char-upcase and char-downcase may depend on the bits
- and font attributes of the character. In particular, they have no effect on a
- character with a non-zero bits attribute, because such characters are by
- definition not alphabetic. See alpha-char-p.
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits
- and font attributes with that of implementation-defined attributes. The effect
- of char-upcase and char-downcase is to preserve implementation-defined
- attributes.
- [change_end]
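-
- For the standard characters the conversions behave as shown below. The
- helper capitalize-first is a hypothetical name, included only to show
- char-upcase applied to one character of a string.

```lisp
(char-upcase #\a)       ; => #\A
(char-upcase #\A)       ; => #\A
(char-downcase #\Q)     ; => #\q
(char-upcase #\7)       ; => #\7, since a digit has no case

;; Hypothetical helper: return a copy of STRING with its first character
;; converted to uppercase; an empty string is returned as is.
(defun capitalize-first (string)
  (if (zerop (length string))
      string
      (concatenate 'string
                   (string (char-upcase (char string 0)))
                   (subseq string 1))))

(capitalize-first "lisp")    ; => "Lisp"
```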
-
- [Function]
- digit-char weight &optional (radix 10) (font 0)
-
- All arguments must be integers. digit-char determines whether or not it is
- possible to construct a character object whose font attribute is font, and
- whose code is such that the result character has the weight weight when
- considered as a digit of the radix radix (see the predicate digit-char-p). It
- returns such a character if that is possible, and otherwise returns nil.
-
- digit-char cannot return nil if font is zero, radix is between 2 and 36
- inclusive, and weight is non-negative and less than radix.
-
- If more than one character object can encode such a weight in the given radix,
- one will be chosen consistently by any given implementation; moreover, among
- the standard characters, uppercase letters are preferred to lowercase letters.
- For example:
-
- (digit-char 7) => #\7
- (digit-char 12) => nil
- (digit-char 12 16) => #\C ;not #\c
- (digit-char 6 2) => nil
- (digit-char 1 2) => #\1
-
- Note that no argument is provided for specifying the bits component of the
- returned character, because a digit cannot have a non-zero bits component. The
- reasoning is that every digit is graphic (see digit-char-p) and no graphic
- character has a non-zero bits component (see graphic-char-p).
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the font argument
- from the specification of digit-char.
- [change_end]
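-
- As a counterpart to the digit-char-p example in section 13.2, digit-char can
- be used to print an integer in a given radix. This is only a sketch: the
- function name convert-integer-to-string is hypothetical, and the argument n
- is assumed to be a non-negative integer and radix to lie between 2 and 36.

```lisp
;; Hypothetical helper: return the digits of the non-negative integer N in
;; the given RADIX as a string, most significant digit first.
(defun convert-integer-to-string (n &optional (radix 10))
  (if (zerop n)
      "0"
      (let ((digits '()))
        ;; Peel off the least significant digit on each iteration; pushing
        ;; onto the list leaves the most significant digit at the front.
        (do ((m n (floor m radix)))
            ((zerop m) (coerce digits 'string))
          (push (digit-char (mod m radix) radix) digits)))))

(convert-integer-to-string 255 16)    ; => "FF"
(convert-integer-to-string 5 2)       ; => "101"
```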
-
- [Function]
- char-int char
-
- The argument char must be a character object. char-int returns a non-negative
- integer encoding the character object.
-
- If the font and bits attributes of char are zero, then char-int returns the
- same integer char-code would. Also,
-
- (char= c1 c2) == (= (char-int c1) (char-int c2))
-
- for characters c1 and c2.
-
- This function is provided primarily for the purpose of hashing characters.
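-
- A minimal sketch of the hashing use mentioned above, assuming a simple
- polynomial hash. The function toy-string-hash, the multiplier 31, and the
- default modulus are illustrative choices, not anything the language
- prescribes.
-

```lisp
;; A toy string hash built on CHAR-INT. Characters that are CHAR= have
;; equal CHAR-INT values, so strings with CHAR= elements hash alike.
(defun toy-string-hash (string &optional (modulus 1000003))
  (let ((hash 0))
    (loop for ch across string
          do (setf hash (mod (+ (* hash 31) (char-int ch)) modulus)))
    hash))

(= (toy-string-hash "lambda")
   (toy-string-hash (copy-seq "lambda")))      ; true

;; The equivalence stated above, checked for one pair of characters.
(eq (char= #\a #\A)
    (= (char-int #\a) (char-int #\A)))         ; true
```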
-
- [old_change_begin]
-
- [Function]
- int-char integer
-
- The argument must be a non-negative integer. int-char returns a character
- object c such that (char-int c) is equal to integer, if possible; otherwise
- int-char returns false.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate int-char.
- [change_end]
-
- [Function]
- char-name char
-
- The argument char must be a character object. If the character has a name, then
- that name (a string) is returned; otherwise nil is returned. All characters
- that have zero font and bits attributes and that are non-graphic (do not
- satisfy the predicate graphic-char-p) have names. Graphic characters may or may
- not have names.
-
- The standard newline and space characters have the respective names Newline and
- Space. The semi-standard characters have the names Tab, Page, Rubout, Linefeed,
- Return, and Backspace.
-
- Characters that have names can be notated as #\ followed by the name. (See
- section 22.1.4.) Although the name may be written in any case, it is stylish to
- capitalize it thus: #\Space.
-
- char-name will only locate ``simple'' character names; it will not construct
- names such as Control-Space on the basis of the character's bits attribute.
-
- [change_begin]
- The easiest way to get a name that includes the bits attribute of a character c
- is (format nil "~:C" c).
- [change_end]
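-
- The following sketch illustrates char-name on the two standard named
- characters. The result for a graphic character such as #\a is left open
- because it may be nil or an implementation-defined name.
-

```lisp
(char-name #\Space)        ; => "Space"
(char-name #\Newline)      ; => "Newline"
(char-name #\a)            ; nil or an implementation-defined name

;; The name after #\ may be written in any case.
(char= #\SPACE #\space #\Space)   ; true

;; ~:C spells out the name of a non-printing character.
(format nil "~:C" #\Space)        ; => "Space"
```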
-
- [Function]
- name-char name
-
- The argument name must be an object coercible to a string as if by the
- function string. If the name is the same as the name of a character object (as
- determined by string-equal), that object is returned; otherwise nil is
- returned.
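-
- A short sketch of the lookup rules just described. The unmatched name in the
- last form is made up, on the assumption that no implementation defines a
- character with that name.
-

```lisp
(name-char "Space")       ; => #\Space
(name-char "SPACE")       ; => #\Space, since comparison is by STRING-EQUAL
(name-char 'newline)      ; => #\Newline, the symbol is coerced as if by STRING

;; For a character that has a name, NAME-CHAR inverts CHAR-NAME.
(char= (name-char (char-name #\Space)) #\Space)   ; true

(name-char "no-such-character-name")   ; nil when no character has this name
```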
-
- -------------------------------------------------------------------------------
-
- 13.5. Character Control-Bit Functions
-
- [old_change_begin]
- Common Lisp provides explicit names for four bits of the bits attribute:
- Control, Meta, Hyper, and Super. The following definitions are provided for
- manipulating these. Each Common Lisp implementation provides these functions
- for compatibility, even if it does not support any or all of the bits named
- below.
-
- [Constant]
- char-control-bit
- char-meta-bit
- char-super-bit
- char-hyper-bit
-
- The values of these named constants are the ``weights'' (as integers) for the
- four named control bits. The weight of the control bit is 1; of the meta bit,
- 2; of the super bit, 4; and of the hyper bit, 8.
-
- If a given implementation of Common Lisp does not support a particular bit,
- then the corresponding constant is zero instead.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate all four of the
- constants char-control-bit, char-meta-bit, char-super-bit, and char-hyper-bit.
-
- When Common Lisp was first designed, keyboards with ``extra bits'' were
- relatively rare. The bits attribute was originally designed to support input
- from keyboards in use at Stanford and M.I.T. circa 1981.
-
- Since that time such extended keyboards have come into wider use. Notable here
- are the keyboards associated with certain personal computers and workstations.
- For example, in some specific applications the command and option keys of Apple
- Macintosh keyboards have had the connotations of control and meta. Macintosh II
- extended keyboards also have keys marked control whose use is analogous to that
- of hyper on the old M.I.T. keyboards. IBM PC personal computer keyboards have
- alt keys that function much like meta keys; similarly, keyboards on Sun
- workstations have keys very much like meta keys but labelled left and right.
- [change_end]
-
- [old_change_begin]
-
- [Function]
- char-bit char name
-
- char-bit takes a character object char and the name of a bit, and returns
- non-nil if the bit of that name is set in char, or nil if the bit is not set in
- char. For example:
-
- (char-bit #\Control-X :control) => true
-
- Valid values for name are implementation-dependent, but typically are :control,
- :meta, :hyper, and :super. It is an error to give char-bit the name of a bit
- not supported by the implementation.
-
- If the argument char is specified by a form that is a place form acceptable to
- setf, then setf may be used with char-bit to modify a bit of the character
- stored in that place. The effect is to perform a set-char-bit operation and
- then store the result back into the place.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate char-bit.
- [change_end]
-
- [old_change_begin]
-
- [Function]
- set-char-bit char name newvalue
-
- set-char-bit takes a character object char, the name of a bit, and a flag. A
- character is returned that is just like char except that the named bit is set
- or reset according to whether newvalue is non-nil or nil. Valid values for name
- are implementation-dependent, but typically are :control, :meta, :hyper, and
- :super. For example:
-
- (set-char-bit #\X :control t) => #\Control-X
- (set-char-bit #\Control-X :control t) => #\Control-X
- (set-char-bit #\Control-X :control nil) => #\X
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate set-char-bit.
- [change_end]
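-
- Because the functions and constants of this section were eliminated, they
- cannot be demonstrated portably. The sketch below models the same
- arithmetic on a plain integer standing in for the bits attribute, using the
- weights 1, 2, 4, and 8 given earlier. The names *bit-weights*, bit-weight,
- bits-test, and bits-set are hypothetical and were never part of the language.
-

```lisp
;; Hypothetical stand-ins for the weights of the four named bits.
(defparameter *bit-weights*
  '((:control . 1) (:meta . 2) (:super . 4) (:hyper . 8)))

(defun bit-weight (name)
  (or (cdr (assoc name *bit-weights*))
      (error "Unknown bit name: ~S" name)))

;; Analogue of CHAR-BIT, operating on an integer bits value.
(defun bits-test (bits name)
  (logtest bits (bit-weight name)))

;; Analogue of SET-CHAR-BIT, operating on an integer bits value.
(defun bits-set (bits name newvalue)
  (if newvalue
      (logior bits (bit-weight name))
      (logandc2 bits (bit-weight name))))

(bits-set 0 :control t)                      ; => 1
(bits-set (bits-set 0 :control t) :meta t)   ; => 3
(bits-test 3 :meta)                          ; true
(bits-set 3 :control nil)                    ; => 2
(bits-set 1 :control t)                      ; => 1, setting a set bit changes nothing
```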
-
- -------------------------------------------------------------------------------
-
-
-
-
-